add compact() API and auto-rebuild indexes #339

Merged
merged 9 commits into from
May 2, 2022

Conversation

staltz
Member

@staltz staltz commented Apr 20, 2022

Context: #306 and ssbc/jitdb#199

This PR adds the compact() ("async") API to ssb-db2, which triggers async-append-only-log's compact(). When compaction is done, ssb-db2 rebuilds all leveldb indexes from scratch, rebuilds private indexes, and partially rebuilds jitdb indexes. All of this is crash-resistant and can resume from where it stopped. Compaction is also guaranteed not to run concurrently with the log.stream that builds indexes, or with any jitdb query.
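
As a rough illustration of why indexes need rebuilding after compaction, here is a toy model. It assumes that indexes point at record positions in the log and that compaction moves records to fill the holes left by deletions. The `buildIndex` and `compact` functions below are hypothetical stand-ins, not the ssb-db2 or async-append-only-log code.

```javascript
// Toy model: the "log" is an array where deleted records are null holes,
// and an "index" maps a message key to its position (offset) in the log.
function buildIndex(log) {
  const index = new Map();
  log.forEach((record, offset) => {
    if (record !== null) index.set(record.key, offset);
  });
  return index;
}

// Hypothetical stand-in for compaction: drop the holes, which shifts
// every record that came after a hole to a smaller offset.
function compact(log) {
  return log.filter((record) => record !== null);
}

const log = [{ key: 'a' }, null, { key: 'b' }, null, { key: 'c' }];

// Index built before compaction: 'c' lives at offset 4.
const staleIndex = buildIndex(log);

// After compaction the log has 3 records, so offset 4 no longer exists.
const compacted = compact(log);

// Rebuilding from the compacted log gives the correct position for 'c'.
const freshIndex = buildIndex(compacted);
```

In this model, looking up `staleIndex.get('c')` in the compacted log points past the end of the data, which is the kind of mismatch a rebuild avoids.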

Member Author

staltz commented Apr 20, 2022

There's an error in the test "compaction resumes automatically after a crash":

RangeError [ERR_OUT_OF_RANGE]: The value of "offset" is out of range. It must be >= 0 and <= 65534. Received 77082
    at new NodeError (internal/errors.js:322:7)
    at boundsError (internal/buffer.js:85:9)
    at Buffer.readUInt16LE (internal/buffer.js:242:5)
    at Object.readDataLength (/home/runner/work/ssb-db2/ssb-db2/node_modules/async-append-only-log/record.js:23:19)
    at Object.getDataNextOffset (/home/runner/work/ssb-db2/ssb-db2/node_modules/async-append-only-log/index.js:234:31)
    at Stream._handleBlock (/home/runner/work/ssb-db2/ssb-db2/node_modules/async-append-only-log/stream.js:89:37)
    at Stream._resumeCallback (/home/runner/work/ssb-db2/ssb-db2/node_modules/async-append-only-log/stream.js:145:27)
    at Object.getBlock (/home/runner/work/ssb-db2/ssb-db2/node_modules/async-append-only-log/index.js:201:7)
    at Stream._resume (/home/runner/work/ssb-db2/ssb-db2/node_modules/async-append-only-log/stream.js:136:12)
    at Stream._next (/home/runner/work/ssb-db2/ssb-db2/node_modules/looper/index.js:11:9) {
  code: 'ERR_OUT_OF_RANGE'
}

It happens mostly in CI and is hard to reproduce locally. I have to solve this crash before the PR can be considered ready.
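
For reference, the error itself is easy to trigger in isolation. The sketch below assumes a 65536-byte block, which is inferred from the `<= 65534` bound in the message (the buffer length minus the 2 bytes of a 16-bit read). `tryReadLength` is a hypothetical helper, not part of async-append-only-log.

```javascript
// Assumption: a 64 KiB block, inferred from the "<= 65534" bound in the error.
const BLOCK_SIZE = 65536;
const block = Buffer.alloc(BLOCK_SIZE);

// Hypothetical helper: attempt a 16-bit little-endian read at `offset`
// and report either the value or the error details.
function tryReadLength(buf, offset) {
  try {
    return { ok: true, length: buf.readUInt16LE(offset) };
  } catch (err) {
    return {
      ok: false,
      code: err.code,
      isRangeError: err instanceof RangeError,
    };
  }
}

// The last valid position for a 2-byte read inside the block.
const inRange = tryReadLength(block, BLOCK_SIZE - 2);

// The offset from the stack trace, which lies beyond a single block.
const outOfRange = tryReadLength(block, 77082);
```

This only shows the mechanics: an offset larger than one block was applied to a single block's buffer.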

@staltz staltz marked this pull request as ready for review May 1, 2022 06:50
Member Author

staltz commented May 1, 2022

I fixed this error. It was caused by updateIndexes() (and a log.stream()) running concurrently with compaction.

When running compact() in ssb-db2, it checks whether there is any active updateIndexes() going on, postponing the compaction if there is.

However, if we are resuming from a crash, then compaction is started automatically by async-append-only-log, and then updateIndexes() is also started automatically in ssb-db2. So I fixed this by adding an if condition in updateIndexes() which checks whether compaction is ongoing. This works because AAOL resumes compaction synchronously, while updateIndexes() is always started asynchronously (it waits for a promise to load, so we know that it happens later).
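
The two guards described above can be sketched roughly as follows. All names here (`ToyDb`, `resumeCompaction`, `finishIndexing`, and so on) are hypothetical and chosen for illustration; this is not the ssb-db2 source, only a minimal model of the ordering: compaction resumes synchronously, indexing starts in a later tick, and each side checks the other's flag.

```javascript
// Hypothetical names throughout; this is not the ssb-db2 source.
class ToyDb {
  constructor({ resumeCompaction = false } = {}) {
    this.compacting = false;
    this.indexing = false;
    this.pendingCompact = false;
    this.events = [];
    // Mirrors the described ordering: a resumed compaction starts
    // synchronously, while indexing is only scheduled for later.
    if (resumeCompaction) this._startCompaction();
    this.ready = Promise.resolve().then(() => this.updateIndexes());
  }

  _startCompaction() {
    this.compacting = true;
    this.events.push('compact:start');
  }

  finishCompaction() {
    this.compacting = false;
    this.events.push('compact:done');
    // Indexes are rebuilt once compaction is over.
    this.updateIndexes();
  }

  updateIndexes() {
    // Guard 1: never stream the log for indexing while compacting.
    if (this.compacting) {
      this.events.push('index:skipped');
      return;
    }
    this.indexing = true;
    this.events.push('index:start');
  }

  finishIndexing() {
    this.indexing = false;
    this.events.push('index:done');
    if (this.pendingCompact) {
      this.pendingCompact = false;
      this._startCompaction();
    }
  }

  compact() {
    // Guard 2: postpone compaction while indexing is active.
    if (this.indexing) {
      this.pendingCompact = true;
      this.events.push('compact:postponed');
      return;
    }
    this._startCompaction();
  }
}
```

With `new ToyDb({ resumeCompaction: true })`, the `compacting` flag is already set by the time the scheduled `updateIndexes()` runs, so that call is skipped until `finishCompaction()` triggers it again.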

@ssbc ssbc deleted a comment from github-actions bot May 1, 2022
Member Author

staltz commented May 1, 2022

@arj03 Although this is the final task for ssb-db2, I still have 2 TODOs:

  • EBT: we need to update or delete some entries after we delete msgs from the log. To keep it simple, let's suppose we only use sbot.db.deleteFeed and thus we delete that whole feedId from EBT too
  • What about browser compatibility for AAOL/compaction.js and DB2 post-compact files?

@ssbc ssbc deleted a comment from github-actions bot May 1, 2022
@staltz staltz requested a review from arj03 May 1, 2022 07:08
db.js
@@ -774,7 +879,6 @@ exports.init = function (sbot, config) {
getState: () => state,
getIndexes: () => indexes,
getIndex: (index) => indexes[index],
clearIndexes,
Member

This was something I used in browser core for testing. I don't mind removing this. It's a thin wrapper around indexes reset anyway.

Member Author

Do you want to bring it back?

@ssbc ssbc deleted a comment from github-actions bot May 2, 2022
Member

@arj03 arj03 left a comment

Nice work 🚀

@ssbc ssbc deleted a comment from github-actions bot May 2, 2022
Member Author

staltz commented May 2, 2022

Feels crazy to finally merge this, but here we go!

@staltz staltz merged commit 7689958 into master May 2, 2022
@staltz staltz deleted the compact branch May 2, 2022 13:24

github-actions bot commented May 2, 2022

Benchmark results

| Part | Duration |
| --- | --- |
| add 1000 elements | 499ms |
| add 1000 private box1 elements | 1339ms |
| unbox 1000 private box1 elements first run | 147ms |
| unbox 1000 private box1 elements second run | 111ms |
| add 1000 private box1 elements | 1364ms |
| query 1000 elements first run | 40ms |
| query 1000 elements second run | 27ms |
| add 1000 private box2 elements | 823ms |
| unbox 1000 private box2 elements first run | 429ms |
| unbox 1000 private box2 elements second run | 441ms |
| Migrate (+db1) | 14870ms |
| Migrate (alone) | 4962ms |
| Migrate (+db1 +db2) | 11202ms |
| Migrate (+db2) | 7505ms |
| Migrate continuation (+db2) | 1090ms |
| Memory usage without indexes | 754.36 MB = 27.38 MB + etc |
| Initial indexing | 705ms |
| Initial indexing maxCpu=86 | 4780ms |
| Initial indexing compat | 1197ms |
| Two indexes updating concurrently | 1060ms |
| key one initial | 53ms |
| key two | 5ms |
| key one again | 0ms |
| reboot and key one again | 51ms |
| latest root posts | 660ms |
| latest posts | 37ms |
| votes one initial | 599ms |
| votes again | 1ms |
| hasRoot | 426ms |
| hasRoot again | 0ms |
| author one posts | 357ms |
| author two posts | 19ms |
| dedicated author one posts | 426ms |
| dedicated author one posts again | 0ms |
| Maximum memory usage | 811.09 MB = 45.74 MB + etc |
| Indexes folder size | 9.97mb |

@ssbc ssbc deleted a comment from github-actions bot May 5, 2022

2 participants